Abstract

We investigate the use of the event topology as a tool in the search for the six-jet
decay of top-pair production in proton-antiproton collisions at 1.8 TeV. Modified
Fox-Wolfram "shape" variables, $H_\ell$, are employed to help distinguish the top-pair signal
from the ordinary QCD multi-jet background. The $H_\ell$'s can be constructed directly
from the calorimeter cells or from jets. Events are required to lie in a region of
$H_\ell$-space defined by $L_\ell < H_\ell < R_\ell$ for $\ell = 1, \ldots, 6$, where the left, $L_\ell$, and right,
$R_\ell$, cuts are determined by a genetic algorithm (GA) procedure to maximize the signal over
the square root of the background. We are able to reduce the background over the signal to
less than a factor of 100 using purely topological methods without using jet multiplicity
cuts and without the aid of b-quark tagging.
1 Introduction

The challenge at hadron colliders is to disentangle any new physics that may be present
from the "ordinary" QCD background. Hadron collider events can be very complicated and
quite often the signal is hiding beneath the background. In
addition, there are many variables that describe a high energy collider event and it is not
always obvious which variables best isolate the signal or precisely which data selections (or
cuts) optimally enhance the signal over the background. In this paper, we use information
on the event topology to help enhance the signal over the background. We define six
modified Fox-Wolfram "shape" variables, $H_\ell$, $\ell = 1, \ldots, 6$, to characterize the topology of
the event. The $H_\ell$'s can be constructed directly from the calorimeter cells or from the
jets. To illustrate our techniques, we will attempt to isolate the six-jet decay of top-pair
production in proton-antiproton collisions at 1.8 TeV from the ordinary QCD multi-jet
background, without tagging b-quarks. B-quark tagging would, of course, further enhance
the signal to background ratio.

The top quark decays into a b-quark and a W boson ($t \to bW$). The W boson decays
into a lepton ($e$ or $\mu$) and a neutrino about 22% (2/9) of the time and into a quark-antiquark
pair about 67% (6/9) of the time. This implies that when top-pairs are produced in
hadron-hadron collisions, $p\bar{p} \to t\bar{t} + X$, both of the W bosons decay into a lepton and neutrino only
about 5% of the time, resulting in a final state consisting of two leptons, two neutrinos,
and two b-quarks ($\ell\nu\ell\nu b\bar{b}$). This distinctive final state constitutes the "discovery" mode of
the top quark at hadron colliders [1], [2]. On the other hand, it is considerably more likely
for one of the W bosons to decay into a quark-antiquark pair, resulting in a final state
consisting of a lepton, a neutrino, a $b\bar{b}$ pair, and a $q\bar{q}$ pair. The $\ell\nu b\bar{b}q\bar{q}$ mode occurs about 35%
of the time or about 7 times more often than the purely leptonic mode. The backgrounds
are larger for this decay mode, but so is the signal. When each of the four outgoing quarks
produces a distinct jet, the resulting event contains a lepton, a neutrino, and four jets
($\ell\nu jjjj$). This decay mode is used to analyze the properties of the top quark in more detail
and to determine, for example, the top mass [?]. The purely hadronic decay mode
shown in Fig. 1 occurs about 60% of the time, and produces the "six-jet" topology shown
in Fig. 2. The six-jet decay mode of top-pair production is buried underneath "ordinary"
QCD multi-jet production such as that illustrated in Fig. 3.
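The three quoted fractions (about 5%, 35% and 60%) can be reproduced with a few lines of arithmetic. The sketch below assumes, as a reading adopted here to match the numbers rather than something stated in the text, that every W decay other than $e\nu$ or $\mu\nu$ (that is, $q\bar{q}'$ together with $\tau\nu$, 7/9 in total) is counted on the non-leptonic side. The variable names are illustrative only.

```python
from fractions import Fraction

# W branching fractions in units of 1/9, as used in the text.
lepton = Fraction(2, 9)   # W -> e nu or mu nu
other = 1 - lepton        # ASSUMPTION: q qbar' and tau nu grouped together (7/9)

dilepton = lepton * lepton             # both W bosons decay to e or mu
lepton_plus_jets = 2 * lepton * other  # exactly one W decays to e or mu
all_other = other * other              # neither W decays to e or mu

for name, value in [("dilepton", dilepton),
                    ("lepton + jets", lepton_plus_jets),
                    ("purely hadronic", all_other)]:
    print(f"{name:16s} {value}  = {float(value):.3f}")
```

Under this grouping the three modes sum to one and the lepton-plus-jets mode is exactly 7 times the dilepton mode, in line with the "about 7 times" quoted above.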

We will attempt to isolate the $t\bar{t}$ six-jet mode from the background using only the event
topology. The signal in Fig. 1 contains b quarks whereas the QCD multi-jet background
in Fig. 3, in general, does not. Therefore, b-quark tagging will improve the signal to
background ratio. However, we would like to investigate how well one can do using only
the event topology. We begin our analysis of the signal and background in Section II
with a discussion of the event simulation and detection. In Section III, we define the $H_\ell$
variables that characterize the collider event topology and in Section IV we discuss the
genetic algorithm (GA) that we use to find optimal regions of $H_\ell$-space. We reconstruct
the top-pair invariant mass in Section V and in Section VI we isolate the top-pair topology
by making $H_\ell$ cuts. Section VII is reserved for summary and conclusions.
Figure 1: Illustration of top-pair production in proton-antiproton collisions in which both of the W bosons
decay hadronically, resulting in a final state consisting of a $b\bar{b}$ pair and two $q\bar{q}$ pairs.

Figure 2: The event topology for the top-pair signal. If each of the outgoing partons produces a
distinct jet, then the final state contains six jets.

Figure 3: Illustration of the QCD multi-jet background to the top-pair production in proton-antiproton
collisions shown in Fig. 1.

2 Event Simulation and Detection 

ISAJET version 7.06 [?] is used to generate top quarks with a mass of 175 GeV in 1.8 TeV
proton-antiproton collisions. At this energy, 175 GeV top-pairs are produced via
quark-antiquark annihilation, $q\bar{q} \to t\bar{t}$, about 88% of the time and by gluon-gluon fusion, $gg \to t\bar{t}$,
the remaining 12%. We refer to this as the "signal". We have normalized the top cross
section to be 7.5 pb, corresponding to 750 events with an integrated luminosity of 100/pb
[?], [6]. The "background" consists of ordinary QCD multi-jet events generated using
ISAJET with the hard-scattering transverse momentum, $\hat{k}_T$, greater than 20 GeV. ISAJET
uses the "leading pole" approximation to produce multi-jets and not the exact matrix
elements. (The $2 \to 2$ matrix elements are exact but not the $2 \to N$ with $N > 2$.) Because
of this, the precise numbers in this paper should not be taken too seriously. Nevertheless,
ISAJET is sufficient to illustrate our techniques.

We do not attempt to do a detailed simulation of the CDF or D0 detector [?].
Events are analyzed by dividing the solid angle into "calorimeter" cells having size
$\Delta\eta\,\Delta\phi = 0.1 \times 7.5°$, where $\eta$ and $\phi$ are the pseudorapidity and azimuthal angle, respectively. Our
simple calorimeter covers the range $|\eta| < 4$ and has 3840 cells. A single cell has an energy
(the sum of the energies of all the particles that hit the cell excluding neutrinos) and a
direction given by the coordinates of the center of the cell. The transverse energy of each
cell is computed from the cell energy and direction. We have taken the energy resolution
to be perfect, which means that the only resolution effects are caused by the lack of spatial
resolution due to the cell size.
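The cell count and the transverse energy of a cell follow directly from the geometry just described. The following is a minimal sketch; the constant and function names are illustrative and not taken from the analysis code, and the indexing convention for the cell centers is an assumption.

```python
import math

D_ETA = 0.1        # cell width in pseudorapidity
D_PHI_DEG = 7.5    # cell width in azimuth, in degrees
ETA_MAX = 4.0      # calorimeter coverage |eta| < 4

n_eta = round(2 * ETA_MAX / D_ETA)   # number of cells along eta
n_phi = round(360.0 / D_PHI_DEG)     # number of cells along phi
n_cells = n_eta * n_phi

def cell_center(i_eta, i_phi):
    """(eta, phi) of the center of cell (i_eta, i_phi); indexing is an assumed convention."""
    eta = -ETA_MAX + (i_eta + 0.5) * D_ETA
    phi = math.radians((i_phi + 0.5) * D_PHI_DEG)
    return eta, phi

def transverse_energy(energy, eta):
    """E_T = E sin(theta), using sin(theta) = 1 / cosh(eta)."""
    return energy / math.cosh(eta)
```

With 80 cells in $\eta$ and 48 in $\phi$ this gives the 3840 cells quoted above.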
3 Variables that Characterize the Event Topology

3.1 Fox-Wolfram Moments

In 1979 Geoffrey Fox and Stephen Wolfram [7] constructed a complete set of rotationally
invariant observables, $H_\ell$, which can be used to characterize the "shapes" of the final states
in electron-positron annihilations. They are constructed from the momentum vectors, $\vec{p}_i$, of
all the final state particles as follows,

$$H_\ell = \left(\frac{4\pi}{2\ell+1}\right) \sum_{m=-\ell}^{+\ell} \left| \sum_i^{\rm particles} \frac{|\vec{p}_i|}{\sqrt{s}}\, Y_\ell^m(\Omega_i) \right|^2, \qquad (1)$$

where the inner sum is over the particles produced, $\sqrt{s}$ is the total center-of-mass energy,
and $Y_\ell^m$ are the spherical harmonics.
Here one must choose a particular set of axes to evaluate the angles, $\Omega_i = (\theta_i, \phi_i)$, of the
final state particles, but the values of the $H_\ell$ are independent of this choice. These moments
lie in the range $0 \le H_\ell \le 1$ and if energy is conserved in the final state then $H_0 = 1$ (neglecting
the masses). If momentum is conserved in the final state then $H_1 = 0$.

The Fox-Wolfram observables (or moments) constitute a complete set of shape parameters.
For example, the collinear "two-jet" final state results in $H_\ell \approx 1$ for even $\ell$ and $H_\ell \approx 0$
for odd $\ell$. Events that are completely spherically symmetric give $H_\ell \approx 0$ for all $\ell \neq 0$.
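The two-jet limit can be checked in one line. Using the addition theorem for spherical harmonics (a standard identity, not something specific to this analysis), the sum over $m$ can be carried out and the moments written as a double sum over pairs of particles, where $\theta_{ij}$ is the opening angle between particles $i$ and $j$ and $P_\ell$ is the Legendre polynomial:

```latex
\sum_{m=-\ell}^{+\ell} Y_\ell^m(\Omega_i)\, Y_\ell^{m*}(\Omega_j)
   = \frac{2\ell+1}{4\pi}\, P_\ell(\cos\theta_{ij})
\quad\Longrightarrow\quad
H_\ell = \sum_{i,j} \frac{|\vec{p}_i|\,|\vec{p}_j|}{s}\, P_\ell(\cos\theta_{ij}) .

% Two massless back-to-back particles, each with |p| = \sqrt{s}/2:
H_\ell = \tfrac{1}{4}\left[ P_\ell(1) + P_\ell(1) + 2 P_\ell(-1) \right]
       = \tfrac{1}{2}\left[ 1 + (-1)^\ell \right] ,
```

which is 1 for even $\ell$ and 0 for odd $\ell$, as stated for the collinear "two-jet" configuration.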



3.2 Constructing Fox-Wolfram Moments from Calorimeter Cells

In hadron-hadron collisions spherical symmetry is lost and we are interested more in the
shape of events in the transverse plane. For example, the Fox-Wolfram moments when
applied directly to hadron-hadron collisions would interpret a minimum bias event as a
"two-jet" event, whereas we would like to have a minimum bias event treated more like a
spherically symmetric $e^+e^-$ final state (i.e., no structure). To accomplish this, we define
the following modified Fox-Wolfram moments for hadron-hadron collisions,

$$H_\ell = \left(\frac{4\pi}{2\ell+1}\right) \sum_{m=-\ell}^{+\ell} \left| \sum_i^{\rm cells} \frac{E_T^i}{E_T({\rm sum})}\, Y_\ell^m(\Omega_i) \right|^2, \qquad (2)$$

where the inner sum is over all the calorimeter cells in the event with transverse energy,
$E_T^i$, greater than some minimum (for example, 5 GeV) and $\Omega_i = (\theta_i, \phi_i)$ are the angular
locations of the centers of the cells. In this case, $E_T({\rm sum})$ is the total transverse energy of
all the cells that are included in the sum. The calorimeter cells contain all the information
concerning the topology of the event and it is not necessary to define jets. These modified
moments also lie in the range $0 \le H_\ell \le 1$ and by definition $H_0 = 1$.

Table 1 shows the mean values and standard deviations for six of the modified
Fox-Wolfram moments calculated using all cells with $E_T$(cell) > 5 GeV for the top-pair signal
and the QCD multi-jet background. The mean values of the six moments $H_1, \ldots, H_6$ are
considerably smaller for the signal than the background. For our calorimeter (3840 cells
with $\Delta\eta\,\Delta\phi = 0.1 \times 7.5°$) equal transverse energy in every cell yields $H_\ell = 0$ for odd
$\ell$ and $H_2 = 0.39$, $H_4 = 0.23$, $H_6 = 0.15$. This corresponds to a cylindrically symmetric
"blob". The signal lies closer to this "blob" configuration in $H_\ell$-space than do most of the
background events.

            $H_\ell$(cell)                          $H_\ell$(jet), $R_j = 0.4$
            Signal           Background       Signal           Background
   $H_1$    0.2053±0.0797    0.3160±0.1698    0.1970±0.0784    0.3104±0.1748
   $H_2$    0.2827±0.1093    0.5479±0.3581    0.2711±0.1046    0.5557±0.3737
   $H_3$    0.2670±0.0951    0.3849±0.1883    0.2593±0.0934    0.3890±0.1985
   $H_4$    0.2738±0.0959    0.4774±0.2670    0.2713±0.0976    0.4937±0.2894
   $H_5$    0.2688±0.0908    0.4058±0.1946    0.2723±0.0964    0.4223±0.2150
   $H_6$    0.2640±0.0867    0.4463±0.2296    0.2744±0.0965    0.4738±0.2612

Table 1: The mean value and standard deviation from the mean (mean ± σ) of six of the modified
Fox-Wolfram moments, $H_\ell$, constructed from the calorimeter cells (with $E_T$(cell) > 5 GeV) and from jets
with $R_j = 0.4$ and $E_T$(jet) > 15 GeV. Results are shown for the top-pair signal and the QCD multi-jet
background in 1.8 TeV proton-antiproton collisions.

The background contains many two, three, and four jet configurations
in addition to some higher jet multiplicity configurations. The top-pair transverse energy
deposition is usually more spread out in $\eta$-$\phi$ space than the background. This can be seen
in Figs. 4 and 5, which show the $H_2$ and $H_4$ distributions, respectively, for the signal and
background. In a given event, all six moments are, on the average, small for the top-pair
signal, whereas for the background usually at least one of the moments is large. This can be
seen in Fig. 6, which shows the distribution of the maximum of the six moments, $H_1, \ldots, H_6$,
in each event for the top-pair signal and the QCD multi-jet background.
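A minimal sketch of how the modified moments of Eq. (2) might be computed from a list of cells is given below. It uses the pairwise Legendre form that follows from the addition theorem for spherical harmonics, $H_\ell = \sum_{i,j} w_i w_j P_\ell(\cos\theta_{ij})$ with $w_i = E_T^i / E_T({\rm sum})$, rather than evaluating the $Y_\ell^m$ explicitly; this is a choice made here for brevity and need not be how the analysis was implemented. The function names and the `(E_T, eta, phi)` cell layout are assumptions, and the double loop scales as the square of the number of hot cells.

```python
import math

def legendre(l_max, x):
    """Return [P_0(x), ..., P_l_max(x)] using Bonnet's recursion."""
    p = [1.0, x]
    for l in range(1, l_max):
        p.append(((2 * l + 1) * x * p[l] - l * p[l - 1]) / (l + 1))
    return p[: l_max + 1]

def modified_fox_wolfram(cells, l_max=6, et_min=5.0):
    """cells: iterable of (E_T, eta, phi). Returns [H_0, ..., H_l_max]."""
    hot = [(et, eta, phi) for et, eta, phi in cells if et > et_min]
    if not hot:
        raise ValueError("no cell above the E_T threshold")
    et_sum = sum(et for et, _, _ in hot)
    # Weight and unit vector of each cell: cos(theta) = tanh(eta), sin(theta) = 1/cosh(eta).
    units = []
    for et, eta, phi in hot:
        s = 1.0 / math.cosh(eta)
        units.append((et / et_sum, s * math.cos(phi), s * math.sin(phi), math.tanh(eta)))
    h = [0.0] * (l_max + 1)
    for wi, xi, yi, zi in units:
        for wj, xj, yj, zj in units:
            cos_ij = max(-1.0, min(1.0, xi * xj + yi * yj + zi * zj))
            for l, p in enumerate(legendre(l_max, cos_ij)):
                h[l] += wi * wj * p
    return h

# Equal E_T in every cell of a coarse stand-in grid (16 x 8 cells over |eta| < 4),
# used here instead of the full 3840-cell grid only to keep the double loop short.
blob = [(10.0, -4.0 + (i + 0.5) * 0.5, math.radians((j + 0.5) * 45.0))
        for i in range(16) for j in range(8)]
h_blob = modified_fox_wolfram(blob)
```

For this uniform configuration the odd moments vanish and the even ones come out close to the "blob" values $H_2 = 0.39$, $H_4 = 0.23$, $H_6 = 0.15$ quoted in the text.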
Figure 4: The modified Fox-Wolfram moment, $H_2$, calculated directly from the calorimeter cells with
$E_T$(cell) > 5 GeV for the top-pair signal and for the QCD multi-jet background. The plot shows the percentage
of events in a 0.05 bin with the sum of all bins normalized to 100%.

Figure 5: The modified Fox-Wolfram moment, $H_4$, calculated directly from the calorimeter cells with
$E_T$(cell) > 5 GeV for the top-pair signal and for the QCD multi-jet background. The plot shows the percentage
of events in a 0.05 bin with the sum of all bins normalized to 100%.

Figure 6: The largest of the first six modified Fox-Wolfram moments, $H_\ell$ for $\ell = 1, \ldots, 6$, in each
event calculated directly from the calorimeter cells with $E_T$(cell) > 5 GeV for the top-pair signal and for the
QCD multi-jet background. The plot shows the percentage of events in a 0.05 bin with the sum of all bins
normalized to 100%.
3.3 Constructing Modified Fox-Wolfram Moments from Jets

Instead of using the calorimeter cells directly to characterize the event topology one can
define "jets" and use them to construct modified Fox-Wolfram moments. We define jets
using a simple algorithm. One first considers the "hot" cells (those with transverse energy
greater than 5 GeV). Cells are combined to form a jet if they lie within a specified "radius"
$R_j^2 = \Delta\eta^2 + \Delta\phi^2$ in $\eta$-$\phi$ space from each other. Jets have an energy given by the sum of
the energy of each cell in the cluster and a momentum $\vec{p}_j$ given by the vector sum of the
momenta of each cell. The invariant mass of a jet is simply $M_j^2 = E_j^2 - \vec{p}_j \cdot \vec{p}_j$. In this
analysis, we examine both "narrow", $R_j = 0.4$, and "fat", $R_j = 0.7$, jets, where jets are
required to have at least 15 GeV of transverse energy.
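The text does not spell out the details of the jet algorithm, so the following is only a sketch of one simple cone-style procedure consistent with the description. The seeding rule (start from the hottest remaining cell), the treatment of each cell as a massless four-vector, and the use of the jet transverse momentum as its transverse energy are all assumptions made for illustration, as are the function names.

```python
import math

def delta_r(eta1, phi1, eta2, phi2):
    """Distance in eta-phi space, with the azimuthal difference wrapped into [0, pi]."""
    dphi = abs(phi1 - phi2) % (2 * math.pi)
    dphi = min(dphi, 2 * math.pi - dphi)
    return math.hypot(eta1 - eta2, dphi)

def cluster_jets(cells, r_jet=0.7, et_cell_min=5.0, et_jet_min=15.0):
    """cells: list of (E, eta, phi). Returns jets as (E, px, py, pz) tuples."""
    def cell_et(c):
        return c[0] / math.cosh(c[1])
    hot = sorted((c for c in cells if cell_et(c) > et_cell_min), key=cell_et, reverse=True)
    jets = []
    while hot:
        seed = hot[0]  # ASSUMED seed: hottest remaining cell
        members = [c for c in hot if delta_r(seed[1], seed[2], c[1], c[2]) <= r_jet]
        hot = [c for c in hot if c not in members]
        e = px = py = pz = 0.0
        for energy, eta, phi in members:  # each cell treated as massless
            et = energy / math.cosh(eta)
            e += energy
            px += et * math.cos(phi)
            py += et * math.sin(phi)
            pz += energy * math.tanh(eta)
        if math.hypot(px, py) > et_jet_min:  # ASSUMED definition of the jet E_T
            jets.append((e, px, py, pz))
    return jets

def jet_mass(jet):
    """M_j from M_j^2 = E_j^2 - p_j . p_j (clipped at zero against rounding)."""
    e, px, py, pz = jet
    return math.sqrt(max(0.0, e * e - (px * px + py * py + pz * pz)))
```

Two nearby 10 GeV cells merge into a single jet for $R_j = 0.7$ but remain separate, and fail the 15 GeV requirement individually, for a much smaller radius, which illustrates why the "narrow" and "fat" jet definitions can give different multiplicities.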

The modified Fox-Wolfram moments are constructed from jets as follows,

$$H_\ell = \left(\frac{4\pi}{2\ell+1}\right) \sum_{m=-\ell}^{+\ell} \left| \sum_i^{\rm jets} \frac{E_T^i}{E_T({\rm sum})}\, Y_\ell^m(\Omega_i) \right|^2, \qquad (3)$$

where the inner sum is now over all the jets in the event with transverse energy, $E_T^i$,
greater than some minimum (which we take to be 15 GeV) and $\Omega_i = (\theta_i, \phi_i)$ are the angular
locations of the jets. Here, $E_T({\rm sum})$ is the sum of the transverse energy of all the jets that
are included in the sum.

Table 1 shows the mean values and standard deviations for six of the modified
Fox-Wolfram moments calculated using all jets with $R_j = 0.4$ and $E_T$(jet) > 15 GeV for the
top-pair signal and the QCD multi-jet background. The mean values are similar to those
constructed directly from the cells and as before the mean values of the six moments
$H_1, \ldots, H_6$ are considerably smaller for the signal than for the background.

One can use the modified Fox-Wolfram moments constructed either from the cells or
from jets. In either case the $H_\ell$'s characterize the topology of the event. At this point one
could make a simple cut on $H_\ell$(max) to enhance signal over background (see Fig. 6), but
one can do better by considering all six moments. The six moments $H_1, \ldots, H_6$ form a
six dimensional space in which different regions of the space correspond to different event
topologies. They range from zero to one and make excellent inputs into a neural network
or Fisher discriminant [?]. In this paper, we will restrict events to lie within a region of the
six dimensional $H_\ell$-space. The region will be defined by $L_\ell < H_\ell < R_\ell$ for $\ell = 1, \ldots, 6$. The
left, $L_\ell$, and right, $R_\ell$, cuts will be selected using a genetic algorithm (GA) to maximize
the signal over the square root of the background.

4 Multi-Dimensional Linear Cuts and Genetic Algorithms

Genetic Algorithms are a broad class of minimization algorithms modeled after genetics and
evolution [?], [10]. In this paper, we will use a GA to perform "optimal" multi-dimensional
linear cuts. In particular, we are interested in finding a set of left, $L_\ell$, and right, $R_\ell$, cuts
($\ell = 1, \ldots, 6$) that maximizes the signal, $N_{sig}$, over the square root of the background,
$\sqrt{N_{bak}}$ (i.e., the statistical significance).

Unlike local algorithms, such as the Gradient Descent algorithm, GA's are much less
likely to become trapped in a local minimum. This is a considerable advantage for a large
class of problems, including our particular application. At the same time, GA's have local
properties which make it possible to find and refine the "optimal" solutions in a reasonable
time, without precluding the possibility that there might be an even
better solution.
Figure 7: Crossover of two parental genes. A split position is chosen at random within the genes of the
parents. The child receives all the bits to the left from one parent and all the bits to the right from the
other parent. The figure shows the two genes $L_\ell = 0.251$ and $R_\ell = 0.5059$ for one parent and $L_\ell = 0.0$ and $R_\ell = 1.0$ for
the other parent. For this crossover, the children receive an unchanged $L_\ell$ (one gets $L_\ell = 0.251$ and the
other gets $L_\ell = 0.0$) and both get a modified $R_\ell$ that is a combination of the parental bits ($R_\ell = 0.5137$ for
the one child and $R_\ell = 0.9922$ for the other).

To use a GA, one must have a set of data and a parametric real valued function on that
data. In our case, the data is the set of six modified Fox-Wolfram moments $H_1, \ldots, H_6$
for 10,000 top-pair signal events and 10,000 QCD multi-jet background events. The real
valued function, $R_f$, on the data is the number of signal events over the square root of
the number of background events that lie within a region of the six dimensional $H_\ell$-space
defined by $L_\ell < H_\ell < R_\ell$ for $\ell = 1, \ldots, 6$. Namely,

$$R_f = \frac{N_{sig}}{\sqrt{N_{bak}}}, \quad {\rm where} \quad L_\ell < H_\ell < R_\ell, \quad \ell = 1, \ldots, 6. \qquad (4)$$
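The fitness function of Eq. (4) amounts to counting the events that fall inside the six-dimensional box. A minimal sketch follows; the function names and the event layout (a tuple of six moments per event) are illustrative, and returning zero when no background event survives is a convention assumed here to avoid a division by zero, since the text does not say how that case is handled.

```python
import math

def passes(moments, cuts):
    """moments: (H_1, ..., H_6); cuts: six (L, R) pairs. True if L < H < R for every l."""
    return all(left < h < right for h, (left, right) in zip(moments, cuts))

def fitness(signal, background, cuts):
    """R_f = N_sig / sqrt(N_bak) for the events inside the box defined by `cuts`."""
    n_sig = sum(passes(event, cuts) for event in signal)
    n_bak = sum(passes(event, cuts) for event in background)
    if n_bak == 0:
        return 0.0  # ASSUMED convention for an empty background region
    return n_sig / math.sqrt(n_bak)
```

In practice the event counts would also be scaled to the assumed integrated luminosity before forming the ratio; that normalization is omitted here.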

In biological terms, the set of signal and background events is the environment in which
a population resides, the real valued function, $R_f$, is analogous to the overall fitness of an
individual for survival and reproduction, and the 12 parameters $L_\ell$ and $R_\ell$ ($\ell = 1, \ldots, 6$)
are the genes of an individual. Since each of the left, $L_\ell$, and right, $R_\ell$, cuts lies between
zero and one, we can multiply them by 255 and represent them as a single byte (eight bits)
within the computer (see footnote 1). For example, the gene corresponding to the left cut $L_\ell = 0.251$ is
represented in the computer as follows:

$$L_\ell = 0.251 \to [01000000]. \qquad (5)$$

Footnote 1: In our calculations we use two bytes to represent each real number, but for illustration it is simpler to
consider just one byte.

Each individual has a set of 12 genes corresponding to the 6 pairs of left and right cuts.
These 12 genes form the "DNA" of the individual, which is represented in the computer as
a string of 12 bytes. For example, all left cuts of zero and all right cuts of one look like
the following:

$$L_1, R_1, \ldots, L_6, R_6 \to [00000000][11111111] \ldots [00000000][11111111]. \qquad (6)$$
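The single-byte encoding of Eqs. (5) and (6) can be sketched as follows. Rounding to the nearest integer is an assumption (the text only says the cut is multiplied by 255), and the function names are illustrative.

```python
def encode(cut):
    """Map a cut in [0, 1] to its eight-bit string; rounding to nearest is an assumption."""
    return format(round(cut * 255), "08b")

def decode(bits):
    """Map an eight-bit string back to a cut in [0, 1]."""
    return int(bits, 2) / 255

def encode_dna(cuts):
    """cuts: [L_1, R_1, ..., L_6, R_6]. Returns the bracketed 12-byte string of Eq. (6)."""
    return "".join(f"[{encode(c)}]" for c in cuts)

gene = encode(0.251)                  # the example of Eq. (5)
open_box = encode_dna([0.0, 1.0] * 6) # all left cuts 0, all right cuts 1
```

With one byte per gene the cuts are resolved in steps of 1/255, roughly 0.004, which is one reason the two-byte representation mentioned in footnote 1 is preferable in an actual calculation.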

Finding an "optimal" solution or solutions is achieved through genetic evolution of 
a population over many generations. Typically, we use 500 to 1,000 individuals with an 
average life span of several simulation years and evolve them through 50 to 100 generations. 
The genetic evolution of the population is achieved through the following mechanisms: 

• Natural selection: At the end of each simulation year, the individuals with the worst
performance are given the highest chance of dying and, therefore, their effect on future
generations is minimized. We do not, however, exterminate the worst performers
unconditionally, as is sometimes done. Unconditional extermination usually degrades the
convergence of the GA, since "good" genes often require time before they lead to "optimal" results.

• Reproduction: Each simulation year, depending on the population size, individuals 
reproduce by selecting a mate. Individuals with higher performance have a higher 
probability of being selected, which further enhances the convergence property of the 
GA. If the population is bigger, the rate of reproduction is smaller and vice versa. 
This has the effect of better convergence because a population is small if many of the 
individuals do not perform well (which happens either at the initial stage of training, 
or when an already trained population discovers a new, much better solution, which 
makes all other individuals bad performers). During reproduction, the following two 
factors are critical: 

— Crossover: The new individual inherits certain genes from one parent and 
others from the other. This affects both the global and the local properties of
the GA, since on the one hand the "good" genes are preserved (local), while on the
other hand new combinations are formed, which has the effect of spanning the
entire parameter space (global). The probability of a crossover is determined by
the crossover rate, $R_c$.

— Mutation: The new individual has some of its genes randomly modified. This 
is an extremely important factor in GA's, since this is the primary mechanism of 
discovering radically new solutions, which, if good, eventually start dominating 
the population. If not good, the individuals that carry them die earlier due to 
natural selection and also due to the smaller probability for reproduction. The
probability of a mutation is determined by the mutation rate, $R_m$. The lower
the mutation rate, the more local the GA is and vice versa.
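The selection mechanisms in the list above can be sketched as follows. The text gives no formulas, so the specific choices here are assumptions made for illustration: a death probability that rises linearly with rank and is capped below one (so that even the worst performer is not removed unconditionally), and fitness-proportional ("roulette wheel") mate selection. The dependence of the reproduction rate on population size is not modeled, and the function names are hypothetical.

```python
import random

def yearly_survivors(population, fitness, max_death_prob=0.5, rng=random):
    """Return the individuals that survive one simulation year.

    Rank 0 is the fittest individual. The death probability grows linearly
    with rank up to max_death_prob (an ASSUMED form), so it never reaches 1.
    """
    ranked = sorted(population, key=fitness, reverse=True)
    n = len(ranked)
    survivors = []
    for rank, individual in enumerate(ranked):
        p_die = max_death_prob * rank / max(n - 1, 1)
        if rng.random() >= p_die:
            survivors.append(individual)
    return survivors

def choose_mate(population, fitness, rng=random):
    """Pick a mate with probability proportional to fitness (ASSUMED selection rule)."""
    weights = [max(fitness(ind), 0.0) + 1e-12 for ind in population]
    return rng.choices(population, weights=weights, k=1)[0]
```

Passing a seeded `random.Random` instance as `rng` makes a run reproducible, which is convenient when comparing different settings.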

In the computer both crossover and mutation are bit-level operations on the genes
(bytes). For example, Fig. 7 shows the crossover of genes from two parents. A split
position is selected at random and the child inherits all bits before the split from one of
the parents, and all bits after the split from the other. The gene affected by the split
becomes a combination of the parental genes, while all other genes are inherited unaltered.
This mechanism has the effect of both local convergence to an already "good" solution (if the
high bits of both parents are the same), and at the same time exploring new combinations
on one of the parameters only. After crossover, the child's genes are mutated by randomly
changing some of its bits. The probability for mutation is usually small in order to allow
natural selection and crossover to find the best gene combinations in the already existing
genetic pool of the population. However, the mutation rate should not be zero in order to
continuously probe the entire space of parameters for potentially better solutions that are
not yet part of the genetic pool.
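The bit-level crossover and mutation just described can be sketched as below, using the single-byte genes of Fig. 7 as the example. The function names are illustrative. The split position (after the 14th bit of the 16-bit string) is not given in the caption; it is inferred here as the position that reproduces the quoted child values.

```python
import random

def crossover(parent_a, parent_b, split):
    """Child takes the first `split` bits from parent_a and the remaining bits from parent_b."""
    n_bytes = len(parent_a)
    n_bits = 8 * n_bytes
    a = int.from_bytes(bytes(parent_a), "big")
    b = int.from_bytes(bytes(parent_b), "big")
    right_mask = (1 << (n_bits - split)) - 1       # bits to the right of the split
    left_mask = ((1 << n_bits) - 1) ^ right_mask   # bits to the left of the split
    child = (a & left_mask) | (b & right_mask)
    return list(child.to_bytes(n_bytes, "big"))

def mutate(dna, rate, rng=random):
    """Flip every bit independently with probability `rate`."""
    mutated = []
    for byte in dna:
        for bit in range(8):
            if rng.random() < rate:
                byte ^= 1 << bit
        mutated.append(byte)
    return mutated

# Genes of Fig. 7 as bytes: (L, R) = (0.251, 0.5059) and (0.0, 1.0).
parent_1 = [64, 129]
parent_2 = [0, 255]
split = 14  # INFERRED from the caption values, not stated there
child_1 = crossover(parent_1, parent_2, split)
child_2 = crossover(parent_2, parent_1, split)
right_cuts = (child_1[1] / 255, child_2[1] / 255)
```

Both children keep an unchanged left cut, and their right cuts come out as 131/255 and 253/255, which round to the 0.5137 and 0.9922 quoted in the caption of Fig. 7.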

For most GA implementations the crossover rate, $R_c$, and the mutation rate, $R_m$, are
fixed parameters. However, this can result in poor convergence properties, since these rates
(especially the mutation rate) have different effects on a non-trained population and an
already trained one. During the first generations of training, it is desirable that these rates
be high in order to quickly scan the parameter space globally. At later stages, when the
genetic pool of the population is "good", rates that are too high interfere with the preservation of the
genetic pool and the performance degrades significantly. A "fine-tuning" of these rates
as a function of simulation time is impractical, so instead we let both the crossover rate and
the mutation rate be genes themselves. In other words, their value at any point in time is 
subject to the same evolution as the parameters of the problem itself. This has a dramatic 
positive effect on the convergence property of the GA. Initially, when the genetic pool is 
random, the values of the crossover and mutation rates are very high (50% on the average). 
This allows for a very fast global scanning of the entire parameter space. As the genetic 
pool improves, individuals with genes corresponding to high crossover and mutation rates 
(even if they are good performers otherwise), produce offspring which significantly deviates 
from the parental genes and, in all likelihood, does not perform as well. In subsequent 
simulation years, this offspring is disadvantaged due to natural selection as well as mate 
selection and thus its genes are not likely to be passed on to future generations. On the 
other hand, individuals with good performance and reasonable crossover and mutation rates 
are more likely to produce offspring reflecting their genetic make-up and, therefore, have 
a much higher probability for their offspring surviving and reproducing. This process is 
dynamic and the population constantly changes the crossover and mutation rates. 

Furthermore, after crossover and mutation, we shift one of the left, $L_\ell$, or right, $R_\ell$,
cuts by an amount $\Delta$,

$$L_\ell \to L_\ell \pm \Delta \quad {\rm or} \quad R_\ell \to R_\ell \pm \Delta. \qquad (7)$$

This constitutes an implicit local algorithm, which allows for faster refinement of the solution
and further enhances the convergence properties of the GA. In addition,
we let the value of $\Delta$ also be a gene so that the complete "DNA" of an individual consists
of the 15 genes, $R_m, R_c, \Delta, L_1, R_1, \ldots, L_6, R_6$. The population discovers dynamically
the best values for the mutation rate, $R_m$, the crossover rate, $R_c$, and $\Delta$. In particular,
during the initial stages of training, $\Delta$ is totally irrelevant, since the major force of change
is mutation. During later stages, the implicit local algorithm helps to refine an already
good solution. At the final stages of evolution, the crossover and mutation rates are very
low, and the improvement is dominated by $\Delta$, until eventually even the local algorithm
cannot improve the solution any more. In that case, $\Delta$ itself becomes very small. This
constitutes the criterion that further evolution is not likely to improve the population any
further.
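One way to picture the 15-gene individual and a single reproduction step is sketched below. To keep it short the sketch works on real-valued genes rather than on bits, re-draws a mutated gene uniformly in [0, 1], and takes the crossover rate from the first parent and the mutation rate and $\Delta$ from the child; all of these are simplifying assumptions, and the class and function names are hypothetical.

```python
import random
from dataclasses import dataclass

@dataclass
class Individual:
    r_m: float    # mutation rate gene
    r_c: float    # crossover rate gene
    delta: float  # local shift gene
    cuts: list    # [L_1, R_1, ..., L_6, R_6]

    def genes(self):
        return [self.r_m, self.r_c, self.delta] + list(self.cuts)

def clamp(x):
    """Keep a gene inside [0, 1]."""
    return min(1.0, max(0.0, x))

def reproduce(mother, father, rng=random):
    a, b = mother.genes(), father.genes()
    # Crossover, applied with the mother's crossover rate (ASSUMED choice).
    if rng.random() < mother.r_c:
        split = rng.randrange(1, len(a))
        child = a[:split] + b[split:]
    else:
        child = list(a)
    # Mutation: each gene is re-drawn with the child's own mutation rate.
    r_m = child[0]
    child = [rng.random() if rng.random() < r_m else g for g in child]
    # Implicit local step of Eq. (7): shift one randomly chosen cut by +/- delta.
    k = rng.randrange(3, len(child))
    child[k] = clamp(child[k] + rng.choice((-1, 1)) * child[2])
    return Individual(child[0], child[1], child[2], child[3:])
```

Because the rates and $\Delta$ are inherited and mutated along with the cuts, lineages whose rates are too disruptive tend to produce poorly performing offspring and are selected against, which is the self-tuning behavior described in the text.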

Table 2 shows the left, $L_\ell$, and right, $R_\ell$, cuts ($\ell = 1, \ldots, 6$) on the $H_\ell$'s determined
from our genetic algorithm (GA) procedure to maximize signal over the square root of the
background. We consider three cases. In the first case the modified Fox-Wolfram moments
are constructed directly from the calorimeter cells, $H_\ell$(cell), with $E_T$(cell) > 5 GeV. The
other two cases are for modified Fox-Wolfram moments constructed from "narrow", $R_j = 0.4$,
jets and from "fat", $R_j = 0.7$, jets, $H_\ell$(jet), with $E_T$(jet) > 15 GeV.

            $H_\ell$(cell) Cuts         $H_\ell$(jet) Cuts, $R_j = 0.4$   $H_\ell$(jet) Cuts, $R_j = 0.7$
            Left (L)    Right (R)    Left (L)    Right (R)          Left (L)    Right (R)
   $H_1$    0.000198    0.347951     0.000000    0.216602           0.007355    0.217731
   $H_2$    0.011261    0.225223     0.000000    0.218647           0.000000    0.256138
   $H_3$    0.013932    0.249973     0.000000    0.265553           0.009720    0.160235
   $H_4$    0.000565    0.588556     0.043092    0.381796           0.021011    0.491890
   $H_5$    0.051927    0.192233     0.000000    0.288945           0.018845    0.395163
   $H_6$    0.026032    0.912840     0.081071    0.726467           0.026642    0.794415

Table 2: The $H_\ell$ cuts determined from a genetic algorithm (GA) to maximize the signal over the
square root of the background. The $H_\ell$'s are restricted to lie in the region $L_\ell < H_\ell < R_\ell$ for $\ell = 1, \ldots, 6$
and are constructed from the calorimeter cells directly, $H_\ell$(cell), or from "narrow", $R_j = 0.4$, jets and from
"fat", $R_j = 0.7$, jets ($H_\ell$(jet)).



5 Reconstructing the Top-Pair Invariant Mass

The top-pair invariant mass, $M_{t\bar{t}}$, corresponds to the center-of-mass energy, $E_{cm}$, of the
underlying parton-parton two-to-two subprocess, which has a threshold at twice the mass
of the top quark, $E_{cm} > 2M_{top}$. Although one cannot precisely reconstruct the
parton-parton CM energy, the hope is that one will be able to observe a peak in the reconstructed
top-pair invariant mass at twice the top quark mass. The size of this peak relative to the
background determines whether this mode can be seen. The top-pair invariant mass can
be reconstructed from the outgoing jets or directly from the calorimeter cells.

5.1 Using the Calorimeter Cells Directly

The parton-parton invariant mass can be constructed directly from the calorimeter cells as
follows:

$$M_{t\bar{t}}^2 = E_{cells}^2 - \vec{p}_{cells} \cdot \vec{p}_{cells}, \qquad (8)$$

where

$$\vec{p}_{cells} = \sum_i^{\rm cells} \vec{p}_i, \qquad (9)$$

and

$$E_{cells} = \sum_i^{\rm cells} E_i. \qquad (10)$$

The overall cell energy, $E_{cells}$, and momentum, $\vec{p}_{cells}$, are constructed by summing over all
cells with transverse energy greater than some minimum (which we take to be 5 GeV).

5.2 Using the Outgoing Jets

The top-pair invariant mass, $M_{t\bar{t}}$, can be constructed from the energy and momentum of
the outgoing jets in the event as follows:

$$M_{t\bar{t}}^2 = E_{jets}^2 - \vec{p}_{jets} \cdot \vec{p}_{jets}, \qquad (11)$$

where

$$\vec{p}_{jets} = \sum_i^{\rm jets} \vec{p}_i, \qquad (12)$$

and

$$E_{jets} = \sum_i^{\rm jets} E_i. \qquad (13)$$

The overall jet energy, $E_{jets}$, and momentum, $\vec{p}_{jets}$, are constructed by summing over all jets
with transverse energy greater than 15 GeV.
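Equations (8) to (13) share one form: sum the four-vectors of the selected objects and take the invariant mass of the total. A minimal sketch follows, with hypothetical function names; each cell is treated as a massless four-vector pointing at the cell center, which is an assumption consistent with, but not stated in, the description of the calorimeter.

```python
import math

def cell_four_vector(energy, eta, phi):
    """(E, px, py, pz) of a cell treated as massless (an assumption)."""
    et = energy / math.cosh(eta)
    return (energy, et * math.cos(phi), et * math.sin(phi), energy * math.tanh(eta))

def invariant_mass(four_vectors):
    """M from M^2 = E^2 - p.p for the summed four-vector (clipped at zero against rounding)."""
    e = sum(v[0] for v in four_vectors)
    px = sum(v[1] for v in four_vectors)
    py = sum(v[2] for v in four_vectors)
    pz = sum(v[3] for v in four_vectors)
    return math.sqrt(max(0.0, e * e - (px * px + py * py + pz * pz)))

def cell_mass(cells, et_min=5.0):
    """Eq. (8): cells is a list of (E, eta, phi); only cells with E_T > et_min are summed."""
    kept = [cell_four_vector(e, eta, phi) for e, eta, phi in cells
            if e / math.cosh(eta) > et_min]
    return invariant_mass(kept)
```

The same `invariant_mass` function applies to Eq. (11) when it is given the four-vectors of the jets that pass the 15 GeV requirement.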



Figure 8: The multiplicity of "fat" jets ($R_j = 0.7$) with transverse energy greater than 15 GeV for
the top-pair signal and the QCD multi-jet background. The plot shows the percentage of events with
$N_j$ jets with $E_T$(jet) > 15 GeV.
   Selection                                                     Mass Type   Mass Range   $N_{sig}$   $N_{bak}$   $N_{bak}/N_{sig}$   $N_{sig}/\sqrt{N_{bak}}$
   Jet Cuts $N_j > 5$ ($R_j = 0.7$, $E_T$(jet) > 15 GeV)         Jet Mass    > 300 GeV    364         444,551     1,221               0.55
   $H_\ell$(cell) Cuts ($E_T$(cell) > 5 GeV)                     Cell Mass   > 250 GeV    54          4,621       85                  0.80
   $H_\ell$(cell) Cuts ($R_j = 0.4$, $E_T$(cell) > 5 GeV)        Jet Mass    > 300 GeV    65          8,138       125                 0.72
   $H_\ell$(jet) Cuts ($R_j = 0.4$, $E_T$(jet) > 15 GeV)         Jet Mass    > 300 GeV    105         17,578      168                 0.79
   $H_\ell$(jet) Cuts ($R_j = 0.7$, $E_T$(jet) > 15 GeV)         Jet Mass    > 300 GeV    87          31,843      365                 0.49

Table 3: 175 GeV top quark pairs produced in 1.8 TeV proton-antiproton collisions. The table shows the
number of events (with $\mathcal{L}$ = 100/pb) for the top-pair signal and the QCD multi-jet background remaining
after making a jet multiplicity cut ($N_j > 5$, $R_j = 0.7$, $E_T$(jet) > 15 GeV) and after making various $H_\ell$ cuts.
The $H_\ell$'s are constructed from the calorimeter cells directly, $H_\ell$(cell), or from narrow, $R_j = 0.4$, jets and
from fat, $R_j = 0.7$, jets, $H_\ell$(jet). The $H_\ell$'s are restricted to lie in the region $L_\ell < H_\ell < R_\ell$ for $\ell = 1, \ldots, 6$,
where the left, $L_\ell$, and right, $R_\ell$, cuts are selected using a genetic algorithm (GA) to maximize the signal
over the square root of the background and are given in Table 2. The top-pair invariant mass is calculated
either directly from the cells (cell mass) or from the jets (jet mass).

6 Isolating Multi-Jet Topologies

6.1 Using Jet Multiplicity Cuts

Fig. 8 shows the multiplicity of jets, $N_j$ (with $R_j = 0.7$ and $E_T > 15$ GeV), for the top-pair
signal and the QCD multi-jet background. One obvious way to enhance the top-pair signal
over the background is to demand that the events have a minimum number of jets, $N_j$(min)
(usually taken to be five). Table 3 shows that after a jet multiplicity cut there are about
360 signal events and roughly 440,000 background events (in 100/pb) for the reconstructed
mass range $M_{t\bar{t}} > 300$ GeV. The background is about a factor of 1,200 times larger than
the signal.
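The two derived columns of Table 3 can be recomputed from the tabulated event counts. The short sketch below does this for all five rows; the row labels are shorthand introduced here. Since the counts in the table are rounded, the recomputed values can differ from the tabulated ones by about one unit in the last quoted digit.

```python
import math

# (N_sig, N_bak, tabulated N_sig / sqrt(N_bak)) for each row of Table 3.
rows = {
    "N_j cut, jet mass":          (364, 444551, 0.55),
    "H(cell) cuts, cell mass":    (54, 4621, 0.80),
    "H(cell) cuts, jet mass":     (65, 8138, 0.72),
    "H(jet) cuts, R_j = 0.4":     (105, 17578, 0.79),
    "H(jet) cuts, R_j = 0.7":     (87, 31843, 0.49),
}

results = {}
for name, (n_sig, n_bak, _) in rows.items():
    ratio = n_bak / n_sig                    # background over signal
    significance = n_sig / math.sqrt(n_bak)  # signal over sqrt(background)
    results[name] = (ratio, significance)
    print(f"{name:26s} B/S = {ratio:7.1f}   S/sqrt(B) = {significance:.3f}")
```

The comparison makes the trade-off explicit: the jet multiplicity cut keeps the most signal but leaves a background more than a thousand times larger, while the cell-based $H_\ell$ cuts give a background over signal below 100 with a comparable or better significance.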

Fig. 9 shows the top-pair invariant mass reconstructed from the "fat" jets in the event
(with $E_T$(jet) > 15 GeV) for the top-pair signal (multiplied by 200) and the QCD multi-jet
background after a jet multiplicity cut ($N_j > 5$). A problem that arises when using a jet
multiplicity cut is that the cut causes an artificial peak in the background invariant mass
near the peak in the signal. Requiring a minimum number of jets with transverse energy
greater than 15 GeV removes events with low parton-parton invariant mass. In addition,
jet multiplicity cuts are "quantized" (i.e., discrete). One cannot smoothly vary the degree
of the cut to, for example, optimize signal over background.
Figure 9: The reconstructed top-pair invariant mass, $M_{t\bar{t}}$, for 175 GeV top quarks produced in
1.8 TeV proton-antiproton collisions together with the QCD multi-jet background for events that have
survived the jet multiplicity cut, $N_j > 5$. The invariant mass is constructed from all the jets ($R_j = 0.7$) in
the event with $E_T$(jet) > 15 GeV. The plot shows the number of events (with $\mathcal{L}$ = 100/pb) in a 50 GeV
bin. The top-pair signal has been multiplied by a factor of 200.

6.2 Using H_ℓ Cuts Without Jets 

In this section, we examine a method for isolating the top-pair signal from the background 
without defining jets at all. The calorimeter cell information is used directly to 
select the events and to reconstruct the top-pair invariant mass. The six modified 
Fox-Wolfram moments H_1, ..., H_6 constructed from the calorimeter cells, H_ℓ(cell), are used 
to select events. Events are required to lie in a region of H_ℓ-space defined by L_ℓ < H_ℓ < R_ℓ 
for ℓ = 1, ..., 6. The left, L_ℓ, and right, R_ℓ, cuts given in Table 2 were determined from 
our genetic algorithm procedure, which maximizes the signal over the square root of the 
background. No jet multiplicity cuts are made. 
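
The window selection L_ℓ < H_ℓ < R_ℓ can be sketched as follows. This is only an illustration: the function name is hypothetical, the six H_ℓ values are assumed to have been computed already, and the cut values below are placeholders, not the values in the paper's table of cuts.

```python
from typing import Sequence


def passes_window_cuts(h: Sequence[float],
                       left: Sequence[float],
                       right: Sequence[float]) -> bool:
    """True if L_l < H_l < R_l holds for every moment l."""
    if not (len(h) == len(left) == len(right)):
        raise ValueError("h, left and right must have the same length")
    return all(lo < x < hi for x, lo, hi in zip(h, left, right))


# Placeholder cuts and moments for l = 1..6 (made-up numbers).
left = [0.00, 0.10, 0.00, 0.05, 0.00, 0.05]
right = [0.20, 0.40, 0.25, 0.35, 0.25, 0.30]
event_h = [0.05, 0.22, 0.10, 0.18, 0.07, 0.12]

print(passes_window_cuts(event_h, left, right))
```

Each of the twelve cut values is a real number, so the selection can be tightened or loosened continuously.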

Fig. 10 shows the top-pair invariant mass reconstructed directly from the calorimeter 
cells (with E_T(cell) > 5 GeV) for the top-pair signal (multiplied by 200) and the QCD 
multi-jet background after the H_ℓ cuts. Table ||] shows that for the reconstructed mass 
range M_tt > 250 GeV there are about 50 signal events and roughly 5,000 background events 
(in 100/pb). Here the background is about a factor of 100 larger than the signal. For the 
top-pair signal, the invariant mass reconstructed from cells with E_T(cell) > 5 GeV peaks at 
about 275 GeV, which is less than the true top-pair mass of 350 GeV, because removing cells with 
transverse energy less than 5 GeV reduces the reconstructed mass from its generated value. 
Nevertheless, this method gives our best statistical significance of 0.8. One is looking for 
a bump above a smoothly falling background, so it does not matter if the mass is shifted 
downward; one can always correct the mass after establishing the signal. 
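
The cell-based mass and the effect of the cell threshold can be illustrated with a short sketch. It assumes, as is common but not stated in the text, that each calorimeter cell is treated as a massless four-vector given by (E_T, pseudorapidity, azimuth); the function name and the example cells are hypothetical.

```python
import math
from typing import Iterable, Tuple

Cell = Tuple[float, float, float]  # (E_T in GeV, pseudorapidity eta, azimuth phi)


def cell_invariant_mass(cells: Iterable[Cell], et_min: float = 5.0) -> float:
    """Invariant mass of all cells with E_T > et_min, each taken as massless."""
    e = px = py = pz = 0.0
    for et, eta, phi in cells:
        if et <= et_min:
            continue  # cell is below threshold and is dropped
        px += et * math.cos(phi)
        py += et * math.sin(phi)
        pz += et * math.sinh(eta)
        e += et * math.cosh(eta)
    m2 = e * e - (px * px + py * py + pz * pz)
    return math.sqrt(max(m2, 0.0))  # guard against tiny negative rounding


# Made-up event: two hard back-to-back cells plus one soft cell.
cells = [(100.0, 0.0, 0.0), (100.0, 0.0, math.pi), (3.0, 1.0, 0.5)]
print(cell_invariant_mass(cells, et_min=5.0))  # soft cell excluded
print(cell_invariant_mass(cells, et_min=1.0))  # soft cell included
```

Dropping the soft cell lowers the reconstructed mass in this toy event, which is the direction of the shift described above.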

Furthermore, this method, in principle, does not cause an artificial peak in 
the reconstructed invariant mass for the background. There is a peak in the background in 
Fig. 10, but it is at much lower mass than the signal and could be eliminated altogether by 
lowering the minimum cell transverse energy of 5 GeV. 

Figure 10: Shows the reconstructed top-pair invariant mass, M_tt, for 175 GeV top quarks produced 
in 1.8 TeV proton-antiproton collisions together with the QCD multi-jet background for events that have 
survived the H_ℓ(cell) cuts. Events are required to have H_ℓ's in the region L_ℓ < H_ℓ < R_ℓ for ℓ = 1, ..., 6, 
where the left, L_ℓ, and right, R_ℓ, cuts are given in Table 2. The H_ℓ(cell)'s and the invariant mass are 
constructed directly from the calorimeter cells using all cells in the event with E_T(cell) > 5 GeV. Jets are 
never defined and no jet multiplicity cuts are made. The plot shows the number of events (with ℒ = 100/pb) 
in a 50 GeV bin. The top-pair signal has been multiplied by a factor of 200. 



After the events have been selected using the H_ℓ(cell)'s, one can construct and examine 
the jets in the event. Fig. 11 shows the multiplicity of "fat" jets (R_j = 0.7) with transverse 
energy greater than 15 GeV for the top-pair signal and the QCD multi-jet background for 
events that have survived the H_ℓ(cell) cuts. By selecting events that lie in the region 
of H_ℓ-space given in Table 2, we have selected events with a large number of jets, but 
in a smooth way. The background now peaks at five jets instead of the two-jet peak in 
Fig. H, and the signal and background jet multiplicities now look similar. Fig. 12 shows the 
top-pair invariant mass, M_tt, reconstructed from jets for the top-pair signal and the QCD 
multi-jet background for events that have survived the H_ℓ(cell) cuts. The invariant mass 
is constructed from all the "narrow" jets (R_j = 0.4) in the event with E_T(jet) > 15 GeV. 
No jet multiplicity cuts are made. Here the invariant mass of the signal peaks at around 
325 GeV, and Table ^ shows that the statistical significance is only slightly lower than in the 
cell invariant mass case. 



6.3 Using H_ℓ Cuts With Jets 

Instead of working with the cells directly, one can define jets from the very beginning 
and do the whole analysis with the jets. The six modified Fox-Wolfram moments 
H_1, ..., H_6 constructed from the jets, H_ℓ(jet), are used to select events. Events are required 
to lie in a region of H_ℓ-space defined by L_ℓ < H_ℓ < R_ℓ for ℓ = 1, ..., 6. Table 2 gives the 
left, L_ℓ, and right, R_ℓ, cuts determined from the genetic algorithm (GA) procedure to 
maximize the signal over the square root of the background. The results for both "narrow" 
jets (R_j = 0.4) and "fat" jets (R_j = 0.7) are given in Table |. The "narrow" jets produce 
better results than the "fat" jets. 

Figure 11: Shows the multiplicity of "fat" jets (R_j = 0.7) with transverse energy greater than 15 GeV 
for the top-pair signal and the QCD multi-jet background for events that have survived the H_ℓ(cell) cuts. 
Events are required to have H_ℓ(cell)'s in the region L_ℓ < H_ℓ(cell) < R_ℓ for ℓ = 1, ..., 6, where the left, 
L_ℓ, and right, R_ℓ, cuts are given in Table 2. The plot shows the percentage of events with N jets with 
E_T(jet) > 15 GeV. 

Figure 12: Shows the reconstructed top-pair invariant mass, M_tt, for 175 GeV top quarks produced 
in 1.8 TeV proton-antiproton collisions together with the QCD multi-jet background for events that have 
survived the H_ℓ(cell) cuts. Events are required to have H_ℓ(cell)'s in the region L_ℓ < H_ℓ < R_ℓ for ℓ = 1, ..., 6, 
where the left, L_ℓ, and right, R_ℓ, cuts are given in Table 2. The H_ℓ(cell)'s are constructed from all the cells 
in the event with E_T(cell) > 5 GeV and the invariant mass is constructed from all the jets (R_j = 0.4) in 
the event with E_T(jet) > 15 GeV. No jet multiplicity cuts are made. The plot shows the number of events 
(with ℒ = 100/pb) in a 50 GeV bin. The top-pair signal has been multiplied by a factor of 200. 

Fig. 13 shows the top-pair invariant mass reconstructed from the jets (with E_T(jet) > 15 
GeV and R_j = 0.4) for the top-pair signal (multiplied by 200) and the QCD multi-jet 
background after the H_ℓ(jet) cuts. Table || shows that for the reconstructed mass range 
M_tt > 300 GeV there are about 100 signal events and roughly 17,000 background events 
(in 100/pb). The background is about a factor of 170 larger than the signal, which is 
comparable to, but slightly worse than, what we get from using the cells directly. 
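
The background-to-signal factors quoted in this section follow from simple division of the event counts. The sketch below uses only the rounded counts given in the text (per 100/pb), so its results are approximate; the function name and dictionary labels are illustrative, and the mass ranges differ between selections as noted in the comments.

```python
def background_over_signal(signal: float, background: float) -> float:
    """Ratio B/S of background to signal event counts."""
    return background / signal


# (signal, background) counts as rounded in the text, per 100/pb.
counts = {
    "jet multiplicity cut (M_tt > 300 GeV)": (360, 460_000),
    "H_l(cell) cuts, cell mass (M_tt > 250 GeV)": (50, 5_000),
    "H_l(jet) cuts, narrow jets (M_tt > 300 GeV)": (100, 17_000),
}

for name, (s, b) in counts.items():
    print(f"{name}: B/S = {background_over_signal(s, b):.0f}")
```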



[Figure 13 plot. Title: Multi-Jet Invariant Mass; R_j = 0.4, E_T(jet) > 15 GeV.]

Figure 13: Shows the reconstructed top-pair invariant mass, M_tt, for 175 GeV top quarks produced 
in 1.8 TeV proton-antiproton collisions together with the QCD multi-jet background for events that have 
survived the H_ℓ(jet) cuts. Events are required to have H_ℓ(jet)'s in the region L_ℓ < H_ℓ(jet) < R_ℓ for 
ℓ = 1, ..., 6, where the left, L_ℓ, and right, R_ℓ, cuts are given in Table 2. The H_ℓ(jet)'s and the invariant 
mass are constructed from all the jets (R_j = 0.4) in the event with E_T(jet) > 15 GeV, but no jet multiplicity 
cuts are made. The plot shows the number of events (with ℒ = 100/pb) in a 50 GeV bin. The top-pair 
signal has been multiplied by a factor of 200. 



7 Summary and Conclusions 

It is difficult to completely isolate the six-jet decay mode of top-pair production over the 
QCD multi-jet background at hadron colliders without b-quark tagging. We are able to 
reduce the background over the signal to less than a factor of 100 using purely topological 
methods and without the use of b-quark tagging. B-quark tagging would, of course, further 
enhance the signal to background ratio. Our technique can be summarized as follows: 

• Construct six modified Fox-Wolfram moments, H_1, ..., H_6, directly from the 
calorimeter cells or from jets. 

• Select events that lie in a certain region of H_ℓ-space defined by L_ℓ < H_ℓ < R_ℓ for 
ℓ = 1, ..., 6. 

• Determine the left, L_ℓ, and right, R_ℓ, cuts using a genetic algorithm (GA) procedure 
that maximizes the signal over the square root of the background. 

• Construct the top-pair invariant mass, M_tt, directly from the calorimeter cells or from 
jets. 
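
The cut-selection step in the list above can be sketched with a small genetic algorithm. This is a toy under stated assumptions, not the authors' GA: the population size, tournament selection, uniform crossover, Gaussian mutation, and elitism are all choices made here for illustration, the H_ℓ values are assumed to lie in [0, 1], and the "events" are random numbers rather than simulated collisions.

```python
import math
import random

N_MOMENTS = 6  # H_1 .. H_6


def make_toy_events(rng, n, a, b):
    """Toy events: six beta-distributed numbers standing in for H_1..H_6."""
    return [[rng.betavariate(a, b) for _ in range(N_MOMENTS)] for _ in range(n)]


def fitness(cuts, signal, background):
    """S / sqrt(B) for events inside the window L_l < H_l < R_l."""
    left, right = cuts[:N_MOMENTS], cuts[N_MOMENTS:]

    def inside(h):
        return all(lo < x < hi for x, lo, hi in zip(h, left, right))

    s = sum(1 for h in signal if inside(h))
    b = sum(1 for h in background if inside(h))
    # Windows with no background are scored 0 to avoid rewarding empty regions.
    return s / math.sqrt(b) if b > 0 else 0.0


def repair(cuts):
    """Clamp every cut to [0, 1] and order each pair so that L_l <= R_l."""
    left, right = [], []
    for lo, hi in zip(cuts[:N_MOMENTS], cuts[N_MOMENTS:]):
        lo, hi = sorted((min(max(lo, 0.0), 1.0), min(max(hi, 0.0), 1.0)))
        left.append(lo)
        right.append(hi)
    return left + right


def run_ga(signal, background, rng, pop_size=20, generations=15, sigma=0.05):
    """Return (best_cuts, best_fitness); cuts are [L_1..L_6, R_1..R_6]."""
    no_cut = [0.0] * N_MOMENTS + [1.0] * N_MOMENTS  # baseline individual
    pop = [no_cut] + [repair([rng.random() for _ in range(2 * N_MOMENTS)])
                      for _ in range(pop_size - 1)]
    for _ in range(generations):
        scored = sorted(((fitness(c, signal, background), c) for c in pop),
                        key=lambda t: t[0], reverse=True)
        elite = [c for _, c in scored[:2]]  # elitism: keep the two best
        children = []
        while len(children) < pop_size - len(elite):
            # Tournament selection of two parents (best of three each).
            p1 = max(rng.sample(scored, 3), key=lambda t: t[0])[1]
            p2 = max(rng.sample(scored, 3), key=lambda t: t[0])[1]
            child = [rng.choice(pair) for pair in zip(p1, p2)]  # uniform crossover
            child = [g + rng.gauss(0.0, sigma) if rng.random() < 0.2 else g
                     for g in child]  # Gaussian mutation
            children.append(repair(child))
        pop = elite + children
    best_score, best = max(((fitness(c, signal, background), c) for c in pop),
                           key=lambda t: t[0])
    return best, best_score


rng = random.Random(12345)
signal = make_toy_events(rng, 200, 2.0, 8.0)       # toy "signal": small H values
background = make_toy_events(rng, 1000, 2.0, 3.0)  # toy "background": broader
best_cuts, best_score = run_ga(signal, background, rng)
print("S/sqrt(B) without cuts:",
      fitness([0.0] * 6 + [1.0] * 6, signal, background))
print("S/sqrt(B) with GA cuts:", best_score)
```

Because the no-cut window is seeded into the first generation and the best individuals are always carried forward, the returned fitness is never worse than applying no cuts at all.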

We do not make a jet multiplicity cut. Jet multiplicity cuts cause an artificial peaking of the 
background invariant mass near the 2M_top peak of the signal, whereas requiring events to 
lie in a region of six-dimensional H_ℓ-space, in principle, does not. Requiring the H_ℓ's to be 
small does select events with a large number of jets, but in a smooth way. Also, H_ℓ cuts can 
be continuously varied, whereas jet multiplicity cuts are discrete. Furthermore, the modified 
Fox-Wolfram moments, H_1, ..., H_6, can be constructed directly from the calorimeter cells 
without the need to define jets. 

We have used the six-jet decay mode of top-quark pair production at hadron colliders as an 
example of our techniques. Other parton-parton subprocesses can be isolated by selecting 
the regions of H_ℓ-space that correspond to their unique topology. For example, many 
supersymmetric subprocesses have characteristic event topologies where our techniques should 
also help to improve the signal to background ratio. 